11 research outputs found

PARDIS: A Programmable Memory Controller for the DDRx Interfacing Standards

Author: Engin Ipek
Mahdi Nazm Bojnordi
Publication date: 01/01/2012

Modern memory controllers employ sophisticated address mapping, command scheduling, and power management optimizations to alleviate the adverse effects of DRAM timing and resource constraints on system performance. A promising way of improving the versatility and efficiency of these controllers is to make them programmable—a proven technique that has seen wide use in other control tasks ranging from DMA scheduling to NAND Flash and directory control. Unfortunately, the stringent latency and throughput requirements of modern DDRx devices have rendered such programmability largely impractical, confining DDRx controllers to fixed-function hardware. This paper presents the instruction set architecture (ISA) and hardware implementation of PARDIS, a programmable memory controller that can meet the performance requirements of a high-speed DDRx interface. The proposed controller is evaluated by mapping previously proposed DRAM scheduling, address mapping, refresh scheduling, and power management algorithms onto PARDIS. Simulation results show that the performance of PARDIS comes within 8 % of an ASIC implementation of these techniques in every case; moreover, by enabling application-specific optimizations, PARDIS improves system performance by 6-17 % and reduces DRAM energy by 9-22 % over four existing memory controllers.

CiteSeerX

Crossref

Memory system optimizations for energy and bandwidth efficient data movement

Author: Engin Ipek
Mahdi Nazm Bojnordi
Publication venue: University of Rochester

Thesis (Ph. D.)--University of Rochester. Department of Electrical and Computer Engineering, 2016.
Since the early 2000s, power dissipation and memory bandwidth have been two of the most critical challenges that limit the performance of computer systems, from data centers to smartphones and wearable devices. Data movement between the processor cores and the storage elements of the memory hierarchy (including the register file, cache levels, and main memory) is the primary contributor to power dissipation in modern microprocessors. As a result, energy and bandwidth efficiency of the memory hierarchy is of paramount importance to designing high performance and energy-efficient computer systems. This research explores a new class of energy-efficient computer architectures that aim at minimizing data movement, and improving memory bandwidth efficiency. We investigate the design of domain specific ISAs and hardware/software interfaces, develop physical structures and microarchitectures for energy efficient memory arrays, and explore novel architectural techniques for leveraging emerging memory technologies (e.g., Resistive RAM) in energy efficient memory-centric accelerators. This dissertation first presents a novel, energy-efficient data exchange mechanism using synchronized counters. The key idea is to represent information by the delay between two consecutive pulses on a set of wires connecting the data arrays to the cache controller. This time-based data representation makes the number of state transitions on the interconnect independent of the bit patterns, and significantly lowers the activity factor on the interconnect. Unlike the case of conventional parallel or serial data communication, however, the transmission time of the proposed technique grows exponentially with the number of bits in each transmitted value. This problem is addressed by limiting the data blocks to a small number of bits to avoid a significant performance loss. 
A viable hardware implementation of the proposed mechanism is presented that incurs negligible area and delay overheads. The dissertation then examines the first fully programmable DDRx controller that enables application specific optimizations for energy and bandwidth efficient data movement between the processor and main memory. DRAM controllers employ sophisticated address mapping, command scheduling, and power management optimizations to alleviate the adverse effects of DRAM timing and resource constraints on system performance. These optimizations must satisfy different system requirements, which complicates memory controller design. A promising way of improving the versatility and energy efficiency of these controllers is to make them programmable—a proven technique that has seen wide use in other control tasks ranging from DMA scheduling to NAND Flash and directory control. Unfortunately, the stringent latency and throughput requirements of modern DDRx devices have rendered such programmability largely impractical, confining DDRx controllers to fixed-function hardware. The proposed programmable controller employs domain specific ISAs with associative search instructions, and carefully partitions tasks between specialized hardware and firmware to meet all the requirements for high performance DRAM management. Finally, this dissertation presents the memristive Boltzmann machine, a novel hardware accelerator that leverages in situ computation with RRAM technology to eliminate unnecessary data movement on combinatorial optimization and deep learning workloads. The Boltzmann machine is a massively parallel computational model capable of solving a broad class of combinatorial optimization problems and training deep machine learning models on massive datasets. Regrettably, the required all-to-all communication among the processing units limits the performance of the Boltzmann machine on conventional memory architectures. 
The proposed accelerator exploits the electrical properties of RRAM to realize in situ, fine-grained parallel computation within the memory arrays, thereby eliminating the need for exchanging data between the memory cells and the computational units. Two classical optimization problems, graph partitioning and boolean satisfiability, and a deep belief network application are mapped onto the proposed hardware
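
The time-based data representation mentioned in this abstract can be illustrated with a toy model. The sketch below is not the dissertation's circuit: it assumes a single wire sampled once per clock cycle and encodes a value as the number of idle cycles between a start pulse and an end pulse. The function names (`encode_delay`, `decode_delay`, `worst_case_cycles`) are hypothetical. It shows why the pulse count is independent of the bit pattern and why the worst-case transfer time grows exponentially with the chunk width.

```python
from typing import List


def encode_delay(value: int, bits: int) -> List[int]:
    """Assumed encoding: start pulse, `value` idle cycles, end pulse."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    return [1] + [0] * value + [1]


def decode_delay(trace: List[int]) -> int:
    """Recover the value by counting idle cycles between the two pulses."""
    first = trace.index(1)
    second = trace.index(1, first + 1)
    return second - first - 1


def worst_case_cycles(bits: int) -> int:
    """Trace length for the largest value: 1 + (2**bits - 1) + 1 cycles."""
    return (1 << bits) + 1


if __name__ == "__main__":
    for v in (0b0000, 0b0101, 0b1111):
        trace = encode_delay(v, bits=4)
        # Every value produces exactly two pulses, whatever its bit pattern.
        print(v, decode_delay(trace), sum(trace), len(trace))
    # Doubling the chunk width from 4 to 8 bits squares the value range.
    print(worst_case_cycles(4), worst_case_cycles(8))
```

Under this model an 8-bit chunk can take up to 257 cycles while a 4-bit chunk takes at most 17, which is consistent with the abstract's point that data blocks are limited to a small number of bits.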

UR Research

MB-CNN: Memristive Binary Convolutional Neural Networks for Embedded Mobile Devices

Author: Arjun Pal Chowdhury
Mahdi Nazm Bojnordi
Pranav Kulkarni
Publication venue: MDPI AG
Publication date: 01/10/2018

Applications of neural networks have gained significant importance in embedded mobile devices and Internet of Things (IoT) nodes. In particular, convolutional neural networks have emerged as one of the most powerful techniques in computer vision, speech recognition, and AI applications that can improve the mobile user experience. However, satisfying all power and performance requirements of such low-power devices is a significant challenge. Recent work has shown that binarizing a neural network can significantly reduce the memory requirements of mobile devices at the cost of a minor loss in accuracy. This paper proposes MB-CNN, a memristive accelerator for binary convolutional neural networks that performs XNOR convolution in situ within novel 2R memristive data blocks to improve the power, performance, and memory requirements of embedded mobile devices. The proposed accelerator achieves at least 13.26×, 5.91×, and 3.18× improvements in system energy efficiency (computed by energy × delay) over state-of-the-art software, GPU, and PIM architectures, respectively. The solution architecture, which integrates CPU, GPU, and MB-CNN, outperforms every other configuration in terms of system energy and execution time.
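
For readers unfamiliar with XNOR convolution, the following is a minimal software sketch of the arithmetic identity commonly used in binary neural networks. It is not the MB-CNN hardware or the authors' code. It assumes weights and activations are restricted to -1 and +1 and packed into integers with bit 1 standing for +1. The helper names `pack_signs` and `xnor_dot` are hypothetical.

```python
from typing import Sequence


def pack_signs(signs: Sequence[int]) -> int:
    """Pack a vector of -1/+1 values into an int (bit i set means +1)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
        elif s != -1:
            raise ValueError("entries must be -1 or +1")
    return word


def xnor_dot(a_word: int, b_word: int, n: int) -> int:
    """Dot product of two packed -1/+1 vectors of length n.

    XNOR marks positions where the signs agree; each agreement adds +1
    and each disagreement adds -1, so the result is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_word ^ b_word) & mask
    matches = bin(agree).count("1")
    return 2 * matches - n


if __name__ == "__main__":
    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    reference = sum(x * y for x, y in zip(a, b))
    print(xnor_dot(pack_signs(a), pack_signs(b), len(a)), reference)
```

A convolution applies this dot product to each sliding window of the input, so the multiply-accumulate work reduces to bitwise XNOR plus a population count.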

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

AxMAP: Making Approximate Adders Aware of Input Patterns

Author: Mahdi Nazm Bojnordi
Masoud Dehyadegari
Mohammad Rezaalipour
Morteza Rezaalipour
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Sanitizer: Mitigating the Impact of Expensive ECC Checks on STT-MRAM Based Main Memories

Author: Engin Ipek
Mahdi Nazm Bojnordi
Qing Guo
Xiaochen Guo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Content Aware Refresh: Exploiting the Asymmetry of DRAM Retention Errors to Reduce the Refresh Frequency of Less Vulnerable Data

Author: Engin Ipek
Mahdi Nazm Bojnordi
Shibo Wang
Xiaochen Guo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Accelerating k-Medians Clustering Using a Novel 4T-4R RRAM Cell

Author: Goverdhan Reddy Pandla
Mahdi Nazm Bojnordi
Manikanth Miryala
Payman Behnam
Yomi Karthik Rupesh
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

PARDIS

Author: Engin Ipek
Jacob B. L.
Kim Y.
Mahdi Nazm Bojnordi
Martin J.
Wilton S.
Publication venue: Association for Computing Machinery (ACM)

Crossref